20 research outputs found

Ambiguity Detection and Textual Claims Generation from Relational Data

Author: Donatello Santoro
Enzo Veltri
Gilbert Badaro
Mohammed Saeed
Paolo Papotti
Publication venue: CEUR-WS
Publication date: 01/01/2022

Archivio della Ricerca - Università della Basilicata

Pythia: Unsupervised generation of ambiguous textual claims from relational data

Author: Badaro Gilbert
Papotti Paolo
Saeed Mohammed
Santoro Donatello
Veltri Enzo
Publication venue: country:USA
Publication date: 01/01/2022

Applications such as computational fact checking and data-to-text generation exploit the relationship between relational data and natural language text. Despite promising results in these areas, state-of-the-art solutions fail to manage “data-ambiguity”, i.e., the case in which there are multiple interpretations of the relationship between a textual sentence and the relational data. To tackle this problem, we introduce Pythia, a system that, given a relational table D, generates textual sentences that contain factual ambiguities w.r.t. the data in D. Such sentences can then be used to train target applications to handle data-ambiguity. In this demonstration, we first show how our system generates data-ambiguous sentences for a given table in an unsupervised fashion by data profiling and query generation. We then demonstrate how two existing applications benefit from Pythia’s generated sentences, improving the state-of-the-art results. The audience will interact with Pythia by changing input parameters, including uploading their own dataset to see which data-ambiguous sentences are generated for it.

Archivio della Ricerca - Università della Basilicata

Transformers for Tabular Data Representation: A Survey of Models and Applications

Author: Gilbert Badaro
Mohammed Saeed
Paolo Papotti
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2023

In the last few years, the natural language processing community has witnessed advances in neural representations of free texts with transformer-based language models (LMs). Given the importance of knowledge available in tabular data, recent research efforts extend LMs by developing neural representations for structured data. In this article, we present a survey that analyzes these efforts. We first abstract the different systems according to a traditional machine learning pipeline in terms of training data, input representation, model training, and supported downstream tasks. For each aspect, we characterize and compare the proposed solutions. Finally, we discuss future work directions.

Directory of Open Access Journals

Transformers for Tabular data representation: A tutorial on models and applications

Author: Badaro Gilbert
Publication venue: 'VLDB Endowment'
Publication date: 05/09/2022

EURECOM Repository

Transformers for tabular data representation: A survey of models and applications

Author: Badaro Gilbert
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 15/01/2023

EURECOM Repository

Transformers for Tabular data representation: A survey of models and applications

Author: Badaro Gilbert
Publication venue: EURECOM
Publication date: 27/10/2021

EURECOM Repository

Transformers for Tabular Data Representation: A Survey of Models and Applications

Author: Badaro Gilbert
Papotti Paolo
Saeed Mohammed
Publication venue: The MIT Press
Publication date: 02/01/2023

In the last few years, the natural language processing community has witnessed advances in neural representations of free texts with transformer-based language models (LMs). Given the importance of knowledge available in tabular data, recent research efforts extend LMs by developing neural representations for structured data. In this work, we present a survey that analyzes these efforts. We first abstract the different systems according to a traditional machine learning pipeline in terms of training data, input representation, model training, and supported downstream tasks. For each aspect, we characterize and compare the proposed solutions. Finally, we discuss future work directions.

Hal-Diderot

Transformers for Tabular Data Representation: A Survey of Models and Applications

Author: Badaro Gilbert
Papotti Paolo
Saeed Mohammed
Publication venue: The MIT Press
Publication date: 02/01/2023

HAL Descartes

Recommender systems using harmonic analysis

Author: Badaro Gilbert
El-Hajj Wassim
Haddad Ali
Hajj Hazem
Shaban Khaled Bashir
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Recommender systems provide recommendations on a variety of personal activities or relevant items of interest. They can play a significant role in e-commerce and in daily personal decisions. However, existing recommender systems still face challenges in dealing with sparse data while achieving high accuracy and reasonable performance. Missing ratings lead to inaccuracies when trying to match items or users for rating prediction. In this paper, we propose to address these challenges with the use of Harmonic Analysis. The paper extends our previous work and provides comprehensive coverage of the method with additional experiments. The method provides a novel multiresolution approach to the user-item matrix and extracts the interplay between users and items at multiple resolution levels. New affinity matrices are defined to measure similarities among users, among items, and across items and users. Furthermore, the similarities are assessed at multiple levels of granularity, allowing individual and group-level similarities. These affinity matrices thus produce multiresolution groupings of items and users, and in turn lead to higher accuracy in matching similar context for ratings, and more accurate prediction of new ratings. The evaluation of the system shows the superiority of the solution compared to state-of-the-art solutions for user-based collaborative filtering and item-based collaborative filtering. 2014 IEEE. Qatar National Research Fund.

Qatar University Institutional Repository

A multiresolution approach to Recommender systems

Author: Badaro Gilbert
El-Hajj Wassim
Haddad Ali
Hajj Hazem
Shaban Khaled Bashir
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

Recommender systems face performance challenges when dealing with sparse data. This paper addresses these challenges and proposes the use of Harmonic Analysis. The method provides a novel approach to the user-item matrix and extracts the interplay between users and items at multiple resolution levels. New affinity matrices are defined to measure similarities among users, among items, and across items and users. Furthermore, the similarities are assessed at multiple levels of granularity, allowing individual and group-level similarities. These affinity matrices thus produce multiresolution groupings of items and users, and in turn lead to higher accuracy in matching similar context for ratings, and more accurate prediction of new ratings. Evaluation results show the superiority of the approach compared to state-of-the-art solutions. NPRP 6-716-1-138 grant from the Qatar National Research Fund (a member of Qatar Foundation).

Qatar University Institutional Repository